Linux tools for text processing (COMPLECS)

Remote event

Many computational and data processing workloads require pre-processing of input files to get the data into a format that is compatible with the user’s application, post-processing of output files to extract key results for further analysis, or both. While these operations can be done by hand, doing so tends to be time-consuming, tedious and, worst of all, error-prone. In this session we cover the Linux tools awk, sed, grep, sort, head, tail, cut, paste, cat and split, which help users automate repetitive tasks. We conclude by showing how large language models (LLMs) such as ChatGPT can be used to write commands that use these tools.
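As a rough illustration of the kind of post-processing described above, the sketch below chains several of the listed tools to pull key results out of an output file. It is not taken from the session materials: the file names (results.log, energies.csv) and the log format are hypothetical, invented only for this example.

```shell
# Work in a scratch directory so the example leaves nothing behind elsewhere.
workdir=$(mktemp -d)

# Hypothetical output file; the name and the key=value format are assumptions.
cat > "$workdir/results.log" <<'EOF'
step=1 energy=-10.5 status=ok
step=2 energy=-12.25 status=ok
step=3 energy=-11.0 status=failed
step=4 energy=-13.75 status=ok
EOF

# grep keeps the successful steps, sed strips the key names,
# and awk writes the first two fields as "step,energy" CSV rows.
grep 'status=ok' "$workdir/results.log" \
  | sed -e 's/step=//' -e 's/energy=//' \
  | awk '{ print $1 "," $2 }' > "$workdir/energies.csv"

# sort orders the rows numerically by the energy column (field 2),
# and head keeps the first row, i.e. the lowest energy.
lowest=$(LC_ALL=C sort -t, -k2,2n "$workdir/energies.csv" | head -n 1)
echo "$lowest"   # prints 4,-13.75
```

Each stage does one small job, so the pipeline can be built up and checked one command at a time before the whole thing is saved as a script.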

Instructor

Nicole Wolter

Computational and Data Science Research Specialist, SDSC

Nicole Wolter is a Computational and Data Science Research Specialist in the High-Performance Computing User Services Group at SDSC. She currently manages accounts and allocations and provides user support for the three HPC systems at SDSC. Nicole graduated from San Diego State University with a degree in Computer Science in 2001. She currently helps users port their AI applications to Voyager, SDSC’s NSF-funded AI supercomputer.